Introduction

My analysis of codon assignment patterns reveals that each
regularity identified in the 'universal' genetic code derives
from a deep structure linked to path-length in amino acid (aa)
synthesis. Conserved vestiges of earlier transitional codes were
uncovered, and they provide an unprecedented insight into the
origin of proteins and the genetic code in the era before the
last common ancestor (pre-LCA).


Methods 

• Time-order of aa addition to the code was equated with path- 
distance, based on the number of reaction steps from the 
citrate cycle to aa, path-takeovers being discounted. 

• Correlates of tRNA aa specificity in consensus LCA tRNA base 
sequences and secondary structure were linked to a mechanism 
coordinating code evolution with the growth of aa pathways. 

• Pre-LCA protein evolution was reconstructed from changes 
identified in the aa alphabet, codon assignment patterns, and 
phylogenetically determined LCA protein residue sequence. 
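
As a rough illustration of the first Methods step, the sketch below orders amino acids by path length and bins them into the three step ranges given under Results (1-2, 4-7 and 9-14 steps). The names `phase_for_path_length` and `order_by_path_length`, the labels and the step counts in `example` are hypothetical placeholders, not values from this poster. Ala is listed with the second phase separately from the 4-7 step range, so a pure range lookup like this one would not capture it.

```python
from typing import Optional

# Step ranges for the three phases, as stated in the Results text.
PHASE_RANGES = {1: (1, 2), 2: (4, 7), 3: (9, 14)}


def phase_for_path_length(steps: int) -> Optional[int]:
    """Return the phase (1-3) whose step range contains `steps`, else None."""
    for phase, (low, high) in PHASE_RANGES.items():
        if low <= steps <= high:
            return phase
    return None  # the text gives no phase for, e.g., 3 or 8 steps


def order_by_path_length(path_lengths: dict) -> list:
    """Sort amino acid labels by path length (shortest, i.e. earliest, first)."""
    return sorted(path_lengths, key=lambda aa: (path_lengths[aa], aa))


# Hypothetical placeholder labels and step counts, for illustration only.
example = {"aa_x": 12, "aa_y": 1, "aa_z": 5}
ordered = order_by_path_length(example)
phases = [phase_for_path_length(example[aa]) for aa in ordered]
```

Ties in path length are broken alphabetically here only to make the ordering deterministic; the poster does not say how ties were treated.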


Results 
Phases in Amino Acid Synthesis and Codon Allocation


[Figure: amino acids and their codons grouped by phase (NH4+
fixers; alkyl, hydroxy, S-containing; basic, aromatic), plotted
as no. code boxes against path length (L). Accompanying panel:
Code Domains.]


• Three phases of aa synthesis and codon allocation occur: 

(1) Four NH4+ fixer/N-donor aa (1-2 step paths) are encoded
by NAN (N, any base) column triplets. They copolymerize to
form polyanionic, amide-bearing peptide chains.

(2) Nine aa (4-7 steps), plus Ala, with alkyl, hydroxyl or
S-containing side chains follow. Consensus codons (bold
mid-base) reveal columnwise code expansion, NCN->NGN->NUN,
accompanied by extension of the consensus aa path from
4->5->7 steps. Codon maps for precursor/product pairs in
extant aa synthesis pathways corroborate this pattern.

Increasingly hydrophobic aa entered the expanding code,
prolonging protein residence-time on a cationic surface
(Fajans-Paneth principle). Membrane and globular proteins,
and Woese (1965) clusters, result. The six smallest aa (orange)
acquired 6/8 intact 'error-free' code boxes (Taylor &
Coates, 1989; Lim & Curran, 2001), on entering the code
first, during expansion from the NH4+ Fixers Code.

(3) Six basic and aromatic aa (9-14 step paths) then acquired 
codons in subdivided 'error-prone' boxes, shared with 
short-path aa, consistent with overprinting of the code by 
latecomers. 
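
The column terminology used in the three phases above (NAN, NCN, NGN, NUN) can be made concrete with a short sketch that enumerates the 16 triplets in each mid-base column. The helper name `column_codons` is hypothetical; the column order simply follows the text, with NAN first and then NCN->NGN->NUN.

```python
from itertools import product

BASES = "UCAG"


def column_codons(mid_base: str) -> list:
    """Return all 16 triplets that have `mid_base` as the middle base."""
    return [first + mid_base + third for first, third in product(BASES, repeat=2)]


# Column order as described in the text: NAN first, then NCN -> NGN -> NUN.
EXPANSION_ORDER = ["A", "C", "G", "U"]
columns = {f"N{mid}N": column_codons(mid) for mid in EXPANSION_ORDER}
```

Each column holds a quarter of the 64 triplets, so a code confined to the NAN column has at most 16 codons available.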

• Elevated codon mid-base coding capacity (Perlwitz et al., 
1988) correlates with most (10/20) aa entering the code
by mid-base substitutions during columnwise expansion. 

• An exponential fall-off in codon assignments implies that a
gradual slowing in the tempo of change preceded freezing of
the code.
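
One way to test for an exponential fall-off of the form N = N0·exp(-k·L) is a least-squares line through ln N against path length L. The sketch below is a minimal version of that idea; the symbols N0 and k and the function name `fit_exponential` are assumptions, and the input is synthetic data generated from an exact exponential purely to show that the fit recovers its parameters. It is not the poster's data.

```python
import math


def fit_exponential(lengths, counts):
    """Least-squares fit of ln(N) = ln(N0) - k*L; returns (N0, k)."""
    logs = [math.log(n) for n in counts]
    m = len(lengths)
    mean_l = sum(lengths) / m
    mean_y = sum(logs) / m
    sxx = sum((l - mean_l) ** 2 for l in lengths)
    sxy = sum((l - mean_l) * (y - mean_y) for l, y in zip(lengths, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_l
    return math.exp(intercept), -slope


# Synthetic inputs (assumed values, not measurements): N = 10 * exp(-0.2 * L).
synthetic_L = [1, 2, 4, 6, 8, 10]
synthetic_N = [10.0 * math.exp(-0.2 * l) for l in synthetic_L]
n0, k = fit_exponential(synthetic_L, synthetic_N)
```

A log-linear fit requires every count to be positive, so path lengths with zero assignments would have to be handled separately.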


• Five domains of contiguous codons, read by tRNA with the
same core structure group (roman numeral, letter) and
specific for same-family aa, are a feature of code
organization (background colors).

• Row-wise domain orientation is consonant with code
expansion.

• Arrows connect tRNA to ancestor; identity, in quarts, is
shown. Superscripts are aa path-distances.

• Back-tracking from A-rich codons of the NH4+ Fixers Code
reveals the first proteins were random copolymers assembled
on a poly(A) template, read by an ancestral universal
acceptor tRNA.

• Selection for oligopeptides, combining surface-binder and N atom donor
functions, is credited with driving evolution of the first code. 

• Synthesis of regio-specific oligopeptides, with attachment and donor 
sites, necessitated 5'-initiation of translation and code expansion, with 
AUN triplets seemingly assigned to an α-aa intermediate of methionine.


Residue 'Code Age' in Pre-LCA Protein Evolution 


• A time-of-origin timeline for pre-LCA proteins was
established from the goodness of fit between residues in
LCA proteins and the aa alphabet (above), at stages
identified in code
Fd, ferredoxin; PL-h1, proteolipid helix-1; FtsZ, prokaryote
septation protein; FEN-1, flap exonuclease; RNAP-α, RNA
polymerase α-subunit; RNR, ribonucleotide reductase; Topo-I,
topoisomerase-I; DNAP, DNA polymerase (pol I, catalytic site).
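
The poster does not define its goodness-of-fit measure, so the following is only one plausible reading: the fraction of a protein's residues covered by the cumulative aa alphabet after each phase. The alphabets are inferred from the three phases under Results (four NH4+ fixers; ten aa with alkyl, hydroxyl or S-containing side chains; six basic and aromatic aa) and are an assumption, as are the function names and the toy sequence. Finer stage values such as 5.6 are not modelled.

```python
# One-letter codes; phase membership is inferred from the Results text.
PHASE_ALPHABETS = [
    set("DENQ"),        # phase 1: NH4+ fixer / N-donor aa
    set("AGVLIPSTCM"),  # phase 2: alkyl, hydroxyl or S-containing side chains
    set("RKHFYW"),      # phase 3: basic and aromatic aa
]


def fit_by_phase(sequence: str) -> list:
    """Fraction of residues covered by the cumulative alphabet after each phase."""
    fractions = []
    alphabet = set()
    for phase_set in PHASE_ALPHABETS:
        alphabet |= phase_set
        covered = sum(1 for residue in sequence if residue in alphabet)
        fractions.append(covered / len(sequence))
    return fractions


def earliest_full_fit(sequence: str):
    """First phase (1-3) whose cumulative alphabet covers every residue, else None."""
    for phase, fraction in enumerate(fit_by_phase(sequence), start=1):
        if fraction == 1.0:
            return phase
    return None


toy_sequence = "DEEDACGV"  # hypothetical, not a reconstructed LCA protein
toy_fit = fit_by_phase(toy_sequence)
toy_phase = earliest_full_fit(toy_sequence)
```

Under this reading, a sequence would be dated to the earliest phase at which its residues are fully covered; a graded score could instead weight partial coverage.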
Monophyletic Pre-LCA tRNA Specific for Same-Family Amino Acids
• Pre-LCA tRNA from a common ancestor, with anticodons (3'->5' bases) cognate
with nearest-neighbor codons and same core-structure group (in brackets), were
adaptors for aa from the same synthesis family. This shows tRNA to be cofactors
in early aa synthesis. Blue branches, tRNA for aa derived from α-ketoglutarate;
c , oxaloacetate; green, pyruvate; brown, phosphoglycerate; , shikimate.

• Code age of acylating aa, from synthesis path-length, increased linearly with 
mutation distance of tRNA species from tree root (branch 1), at base of tree. 


• A 23-residue antecedent of low potential ferredoxin, 
Pro-Ferredoxin-5 (code age, stage 5.6), possessed a 
negatively charged 7-residue N-terminal 'surface 
attachment' segment and 16-residue C-terminal 
[4Fe-4S] cofactor binding segment. In binding an 
inorganic cofactor to a cationic mineral surface, Pro- 
Fd-5 is a prototype pre-cell surface-adaptor protein. 

• Pro-Fd-5 has been reconstructed at the Denmark 
Technical University (Christophersen, 2004). 


Conclusion 

• Protein synthesis and the genetic code originated at the 
point of entry of N atoms into a primordial molecular 
network on a cationic mineral surface system, replacing 
N entry solely through free aa formed on amination of 
two reverse (autocatalytic) citrate cycle components. 

• Code structure is highly ordered. It conserves evidence 
of early codes, tRNA diversification and phases in aa 
synthesis. Time of basic aa entry and PL-h1 origin imply
the code was about half-formed when cells first arose. 


